Approximate Dynamic Programming via Iterated Bellman Inequalities

Authors

  • Yang Wang
  • Stephen Boyd
Abstract

In this paper we introduce new methods for finding functions that lower bound the value function of a stochastic control problem, using an iterated form of the Bellman inequality. Our method is based on solving linear or semidefinite programs, and produces both a bound on the optimal objective and a suboptimal policy that appears to work very well. These results extend and improve bounds obtained by the authors in a previous paper using a single Bellman inequality condition. We describe the methods in a general setting, and show how they can be applied in specific cases including the finite state case, constrained linear quadratic control, switched affine control, and multi-period portfolio investment.
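To make the Bellman-inequality idea concrete, here is a minimal sketch (not the paper's iterated SDP construction; the toy MDP and all names below are invented for illustration). For a small finite-state, finite-action MDP, any V satisfying the single Bellman inequality V <= TV lower-bounds the optimal value function V*, and maximizing sum(V) subject to these linear constraints is an LP:

```python
import numpy as np
from scipy.optimize import linprog

n, m = 3, 2          # states, actions (toy sizes)
gamma = 0.9          # discount factor
rng = np.random.default_rng(0)

# Random transition matrices P[a] (rows sum to 1) and stage costs c[x, a].
P = rng.random((m, n, n))
P /= P.sum(axis=2, keepdims=True)
c = rng.random((n, m))

# Bellman inequality V <= T V, written as one linear constraint per (x, a):
#   V(x) - gamma * sum_y P[a][x, y] V(y) <= c(x, a)
A_ub, b_ub = [], []
for a in range(m):
    for x in range(n):
        row = -gamma * P[a, x].copy()
        row[x] += 1.0
        A_ub.append(row)
        b_ub.append(c[x, a])

# Maximize sum(V)  <=>  minimize -sum(V); V is free (no sign constraint).
res = linprog(c=-np.ones(n), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * n)
V = res.x

# Sanity check against value iteration: with one free variable per state
# (no basis-function restriction) the LP recovers V* exactly.
Vstar = np.zeros(n)
for _ in range(1000):
    Vstar = np.min(c + gamma * np.einsum('axy,y->xa', P, Vstar), axis=1)
print(np.allclose(V, Vstar, atol=1e-5))   # prints True
```

The paper's contribution is what happens when V is restricted to a parameterized family (e.g. quadratics, giving semidefinite rather than linear programs) and the inequality is iterated, V <= T^k V, which weakens the constraint and tightens the resulting lower bound.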


Similar references

Approximate Dynamic Programming Strategy for Dual Adaptive Control

An approximate dynamic programming (ADP) strategy for a dual adaptive control problem is presented. An optimal control policy of a dual adaptive control problem can be derived by solving a stochastic dynamic programming problem, which is computationally intractable using conventional solution methods that involve sampling of a complete hyperstate space. To solve the problem in a computationally...


Characterization of facets of the hop constrained chain polytope via dynamic programming

In this paper, we study the hop constrained chain polytope, that is, the convex hull of the incidence vectors of (s, t)-chains using at most k arcs of a given digraph, and its dominant. We use extended formulations (implied by the inherent structure of the Moore-Bellman-Ford algorithm) to derive facet defining inequalities for these polyhedra via projection. Our findings result into characteriz...


QR-Tuning and Approximate-LS Solutions of the HJB Equation for Online DLQR Design via State and Action-Dependent Heuristic Dynamic Programming

A novel approach for online design of optimal control systems based on QR-tuning, state and action-dependent heuristic dynamic programming, and approximate-LS solutions of the Hamilton-Jacobi-Bellman (HJB) equation is the main concern of this paper. The QR-tuning for optimal control systems takes into account heuristic variations in the weighting matrices Q and R of the discrete linear quadratic...


Approximate Dynamic Programming based on Projection onto the (min, +) subsemimodule

We develop a new Approximate Dynamic Programming (ADP) method for infinite horizon discounted reward Markov Decision Processes (MDP) based on projection onto a subsemimodule. We approximate the value function in terms of a (min,+) linear combination of a set of basis functions whose (min,+) linear span constitutes a subsemimodule. The projection operator is closely related to the Fenchel transf...


Approximate dynamic programming via direct search in the space of value function approximations

This paper deals with approximate value iteration (AVI) algorithms applied to discounted dynamic programming (DP) problems. For a fixed control policy, the span semi-norm of the so-called Bellman residual is shown to be convex in the Banach space of candidate solutions to the DP problem. This fact motivates the introduction of an AVI algorithm with local search that seeks to minimize the span s...
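As a small hypothetical illustration of the span semi-norm used in the abstract above (the toy policy MDP and all names here are invented for this sketch): for a fixed policy, the Bellman residual of a candidate V is T_pi V - V, and span(v) = max(v) - min(v) ignores constant shifts of V, which is why it is only a semi-norm.

```python
import numpy as np

rng = np.random.default_rng(1)
n, gamma = 4, 0.95
P = rng.random((n, n)); P /= P.sum(axis=1, keepdims=True)  # policy's transitions
c = rng.random(n)                                          # policy's stage costs

def span(v):
    """Span semi-norm: zero on constant vectors, hence not a norm."""
    return float(np.max(v) - np.min(v))

def bellman_residual(V):
    return c + gamma * P @ V - V   # T_pi V - V for the fixed policy

V = rng.random(n)
r1 = span(bellman_residual(V))
r2 = span(bellman_residual(V + 7.0))   # shift V by a constant
print(abs(r1 - r2) < 1e-9)             # span is shift-invariant: prints True
```

Shifting V by a constant k shifts the residual uniformly by (gamma - 1) * k, so the span is unchanged; this invariance is what makes the span semi-norm of the residual convex and well suited to the direct-search scheme the blurb describes.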



Publication date: 2010